10 research outputs found
A Masked Face Classification Benchmark on Low-Resolution Surveillance Images
We propose a novel image dataset focused on tiny faces wearing face masks for mask classification purposes, dubbed Small Face MASK (SF-MASK). It comprises 20k low-resolution images, ranging from 7 x 7 to 64 x 64 pixels, exported from diverse and heterogeneous datasets. An accurate visualization of this collection through counting grids made it possible to highlight gaps in the variety of poses assumed by the pedestrians' heads. In particular, faces filmed by very high cameras, in which the facial features appear strongly skewed, are absent. To address this structural deficiency, we produced a set of synthetic images which achieves a satisfactory coverage of the intra-class variance. Furthermore, a small subsample of 1701 images contains badly worn face masks, opening the door to multi-class classification challenges. Experiments on SF-MASK focus on face mask classification using several classifiers. Results show that the richness of SF-MASK (real + synthetic images) leads all of the tested classifiers to perform better than when trained on comparable face mask datasets, evaluated on a fixed test set of 1077 images. Dataset and evaluation code are publicly available here: https://github.com/HumaticsLAB/sf-mask
Comment: 15 pages, 7 figures. Accepted at T-CAP workshop @ ICPR 202
Language-enhanced RNR-Map: Querying Renderable Neural Radiance Field maps with natural language
We present Le-RNR-Map, a Language-enhanced Renderable Neural Radiance map for Visual Navigation with natural language query prompts. The recently proposed RNR-Map employs a grid structure comprising latent codes positioned at each pixel. These latent codes, derived from image observations, enable: i) image rendering given a camera pose, since they are converted to a Neural Radiance Field; ii) image navigation and localization with astonishing accuracy. On top of this, we enhance RNR-Map with CLIP-based embedding latent codes, allowing natural language search without additional label data. We evaluate the effectiveness of this map in single- and multi-object searches. We also investigate its compatibility with a Large Language Model as an "affordance query resolver". Code and videos are available at https://intelligolabs.github.io/Le-RNR-Map/
Comment: Accepted at ICCVW23 VLA
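The natural-language search described above can be illustrated with a minimal sketch: each map cell holds a latent embedding, and a query embedding is matched against the grid by cosine similarity. This is a toy stand-in only; the hand-made numpy embeddings below replace the real CLIP encoders, and `query_map` is a hypothetical helper, not part of the Le-RNR-Map codebase.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def query_map(latent_grid, query_embedding):
    """Return the (row, col) of the map cell most similar to the query."""
    h, w, _ = latent_grid.shape
    sims = np.array([[cosine_similarity(latent_grid[i, j], query_embedding)
                      for j in range(w)] for i in range(h)])
    return np.unravel_index(np.argmax(sims), sims.shape)

rng = np.random.default_rng(0)
grid = rng.normal(size=(4, 4, 8))                # 4x4 map of 8-dim latent codes
target = grid[2, 3] + 0.01 * rng.normal(size=8)  # query close to cell (2, 3)
print(query_map(grid, target))                   # -> (2, 3)
```

In the actual system the query embedding would come from a CLIP text encoder and the grid from CLIP-aligned image features, so text and map share one embedding space.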
A Machine Learning-oriented Survey on Tiny Machine Learning
The emergence of Tiny Machine Learning (TinyML) has positively revolutionized the field of Artificial Intelligence by promoting the joint design of resource-constrained IoT hardware devices and their learning-based software architectures. TinyML plays an essential role within the fourth and fifth industrial revolutions in helping societies, economies, and individuals employ effective AI-infused computing technologies (e.g., smart cities, automotive, and medical robotics). Given its multidisciplinary nature, the field of TinyML has been approached from many different angles: this comprehensive survey aims to provide an up-to-date overview focused on all the learning algorithms within TinyML-based solutions. The survey follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodological flow, allowing for a systematic and complete literature review. In particular, we first examine the three different workflows for implementing a TinyML-based system, i.e., ML-oriented, HW-oriented, and co-design. Second, we propose a taxonomy that covers the learning panorama under the TinyML lens, examining in detail the different families of model optimization and design, as well as the state-of-the-art learning techniques. Third, we present the distinct features of hardware devices and software tools that represent the current state of the art for TinyML intelligent edge applications. Finally, we discuss the challenges and future directions.
Comment: Article currently under review at IEEE Access
IoT Systems for Healthy and Safe Life Environments
The past two years have been sadly marked by the worldwide spread of the SARS-CoV-2 pandemic. The first line of defense against this and other pandemic threats is to respect interpersonal distances, use masks, and sanitize hands, air, and objects. Some of these countermeasures are becoming part of our daily lives, as they are now considered good practices to reduce the risk of infection and contagion. In this context, we present Safe Place, a modular system enabled by the Internet of Things (IoT) that is designed to improve the safety and healthiness of living environments. This system combines several sensors and actuators produced by different vendors with self-regulating procedures and Artificial Intelligence (AI) algorithms to limit the spread of viruses and other pathogens, and to increase the quality and comfort offered to people while minimizing energy consumption. We discuss the main objectives of the system and its implementation, showing preliminary results that assess its potential in enhancing the conditions of living and working spaces.
Split-Et-Impera: A Framework for the Design of Distributed Deep Learning Applications
Many recent pattern recognition applications rely on complex distributed architectures in which sensing and computational nodes interact through a communication network. Deep neural networks (DNNs) play an important role in this scenario, furnishing powerful decision mechanisms at the price of a high computational effort. Consequently, powerful state-of-the-art DNNs are frequently split over various computational nodes, e.g., a first part stays on an embedded device and the rest on a server. Deciding where to split a DNN is a challenge in itself, making the design of deep learning applications even more complicated. Therefore, we propose Split-Et-Impera, a novel and practical framework that i) determines the set of best split points of a neural network based on deep network interpretability principles, without resorting to a tedious try-and-test approach; ii) performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements; and iii) suggests the best match between the quality-of-service requirements of the application and the performance in terms of accuracy and latency.
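The device/server split described above can be sketched in a few lines: a sequential network is cut at a chosen layer, the head runs on the embedded device, the tail on the server, and only the intermediate activation crosses the network. This is a hedged toy example with a made-up numpy network, not the Split-Et-Impera framework itself.

```python
import numpy as np

rng = np.random.default_rng(1)
# A toy 3-layer network: weight matrices of a small MLP (assumed, not trained).
layers = [rng.normal(size=(16, 32)), rng.normal(size=(32, 8)),
          rng.normal(size=(8, 4))]

def forward(x, weights):
    """Run x through a list of linear+ReLU layers."""
    for w in weights:
        x = np.maximum(x @ w, 0.0)
    return x

x = rng.normal(size=(1, 16))
split = 1                                          # cut after the first layer
activation = forward(x, layers[:split])            # runs on the device
server_out = forward(activation, layers[split:])   # runs on the server
full_out = forward(x, layers)                      # reference: unsplit run
print(np.allclose(server_out, full_out))           # -> True
```

The split point governs the trade-off the abstract mentions: an earlier cut means less on-device compute but a possibly larger activation to transmit, which is exactly what a communication-aware simulation would evaluate.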
I-SPLIT: Deep Network Interpretability for Split Computing
This work makes a substantial step in the field of split computing, i.e., how to split a deep neural network to host its early part on an embedded device and the rest on a server. So far, potential split locations have been identified by exploiting uniquely architectural aspects, i.e., based on the layer sizes. Under this paradigm, the efficacy of the split in terms of accuracy can be evaluated only after having performed the split and retrained the entire pipeline, making an exhaustive evaluation of all the plausible splitting points prohibitive in terms of time. Here we show that not only does the architecture of the layers matter, but so does the importance of the neurons contained therein. A neuron is important if its gradient with respect to the correct class decision is high. It follows that a split should be applied right after a layer with a high density of important neurons, in order to preserve the information flowing until then. Building on this idea, we propose Interpretable Split (I-SPLIT): a procedure that identifies the most suitable splitting points by providing a reliable prediction of how well a split will perform in terms of classification accuracy, before its actual implementation. As a further major contribution of I-SPLIT, we show that the best choice of splitting point for a multiclass categorization problem also depends on which specific classes the network has to deal with. Exhaustive experiments have been carried out on two networks, VGG16 and ResNet-50, and three datasets, Tiny-Imagenet-200, notMNIST, and Chest X-Ray Pneumonia. The source code is available at https://github.com/vips4/I-Split
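The gradient-based importance criterion above can be made concrete with a minimal sketch: for a toy two-layer linear net (assumed here, not the paper's VGG16/ResNet-50 setup), the gradient of the correct-class score with respect to a hidden neuron is easy to write down, and the fraction of neurons with a large gradient gives a rough per-layer "important-neuron density".

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(size=(10, 6))   # input -> hidden
W2 = rng.normal(size=(6, 3))    # hidden -> 3 class logits

x = rng.normal(size=10)
h = x @ W1                      # hidden activations
logits = h @ W2
c = int(np.argmax(logits))      # take the predicted class as the "correct" one

# For this linear net, d(logit_c)/dh is exactly the c-th column of W2.
grad = W2[:, c]
importance = np.abs(grad)       # a neuron is important if this is high

# Density of important neurons at the hidden layer (threshold = mean, assumed).
density = np.mean(importance > importance.mean())
print(f"important-neuron density at hidden layer: {density:.2f}")
```

In I-SPLIT this kind of density, computed per layer via backpropagated gradients, is what ranks candidate split points before any retraining.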
SCENE-pathy: Capturing the Visual Selective Attention of People Towards Scene Elements
We present SCENE-pathy, a dataset and a set of baselines to study the visual selective attention (VSA) of people towards the 3D scene in which they are located. In practice, VSA allows one to discover which parts of the scene are most attractive for an individual. Capturing VSA is of primary importance in the fields of marketing, retail management, surveillance, and many others. So far, VSA analysis has focused on very simple scenarios: a mall shelf or a tiny room, usually with a single subject involved. Our dataset, instead, considers a multi-person and much more complex 3D scenario, specifically a high-tech fair showroom presenting machines of an Industry 4.0 production line, where 25 subjects were captured for 2 min each while moving, observing the scene, and having social interactions. Also, the subjects filled out a questionnaire indicating which part of the scene was most interesting for them. Data acquisition was performed using HoloLens 2 devices, which allowed us to get ground-truth data related to people's tracklets and gaze trajectories. Our proposed baselines capture VSA from mere RGB video data and a 3D scene model, providing interpretable 3D heatmaps. In total, there are more than 100K RGB frames with, for each person, annotated 3D head positions and 3D gaze vectors. The dataset is available here: https://intelligolabs.github.io/scene-pathy
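The attention heatmaps mentioned above boil down to accumulating where gaze lands on a discretized scene. The sketch below is purely illustrative (a hypothetical 2D `gaze_heatmap` helper with made-up hit points), not the SCENE-pathy baseline, which works on a full 3D scene model.

```python
import numpy as np

def gaze_heatmap(hits, grid_size, extent):
    """Bin gaze hit points (N, 2) into a normalized attention grid.

    extent: (xmax, ymax) of the scene in metres; cells outside are clamped.
    """
    heat = np.zeros(grid_size)
    for x, y in hits:
        i = min(int(x / extent[0] * grid_size[0]), grid_size[0] - 1)
        j = min(int(y / extent[1] * grid_size[1]), grid_size[1] - 1)
        heat[i, j] += 1
    return heat / heat.sum()   # fraction of gaze time per cell

# Three illustrative gaze hits in a 10 m x 10 m scene, binned on a 4x4 grid.
hits = np.array([[1.0, 1.0], [1.2, 0.9], [8.0, 8.0]])
heat = gaze_heatmap(hits, (4, 4), (10.0, 10.0))
print(heat.round(2))   # cell (0, 0) gathers 2 of the 3 hits
```

Normalizing by the total number of hits makes heatmaps comparable across subjects observed for different durations.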
Pose Forecasting in Industrial Human-Robot Collaboration
Pushing back the frontiers of collaborative robots in industrial environments, we propose a new Separable-Sparse Graph Convolutional Network (SeS-GCN) for pose forecasting. For the first time, SeS-GCN bottlenecks the interaction of the spatial, temporal and channel-wise dimensions in GCNs, and it learns sparse adjacency matrices by a teacher-student framework. Compared to the state of the art, it uses only 1.72% of the parameters and is ∼4 times faster, while still performing comparably in forecasting accuracy on Human3.6M at 1 s in the future, which enables cobots to be aware of human operators. As a second contribution, we present a new benchmark of Cobots and Humans in Industrial COllaboration (CHICO). CHICO includes multi-view videos, 3D poses and trajectories of 20 human operators and cobots, engaging in 7 realistic industrial actions. Additionally, it reports 226 genuine collisions, taking place during the human-cobot interaction. We test SeS-GCN on CHICO for two important perception tasks in robotics: human pose forecasting, where it reaches an average error of 85.3 mm (MPJPE) at 1 s in the future with a run time of 2.3 ms, and collision detection, by comparing the forecasted human motion with the known cobot motion, obtaining an F1-score of 0.64.
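The MPJPE figure reported above is the standard pose-forecasting metric: the Euclidean distance between each predicted and ground-truth 3D joint, averaged over joints and frames. A minimal sketch, on made-up poses (the values below are illustrative only, not CHICO data):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error.

    pred, gt: arrays of shape (frames, joints, 3), in millimetres.
    """
    return np.mean(np.linalg.norm(pred - gt, axis=-1))

gt = np.zeros((2, 17, 3))               # 2 frames, 17 joints at the origin
pred = gt + np.array([3.0, 4.0, 0.0])   # every joint off by a 3-4-0 vector
print(mpjpe(pred, gt))                  # -> 5.0 (mm)
```

So an MPJPE of 85.3 mm means each forecasted joint lies, on average, 8.5 cm from its true position one second ahead.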
I-MALL: An Effective Framework for Personalized Visits. Improving the Customer Experience in Stores
In this paper we present I-MALL, an ICT hardware and software infrastructure that enables the management of services related to places such as shopping malls, showrooms, and conferences held in dedicated facilities. I-MALL offers a network of services that perform customer behavior analysis through computer vision and provide personalized recommendations made available on digital signage terminals. The user can also interact with a social robot. Recommendations are inferred on the basis of the profile of interests computed by the system by analysing the history of the customer's visit and his/her behavior, including information from his/her appearance, the route taken inside the facility, as well as his/her mood and gaze.
The Post-pandemic Effects on IoT for Safety: The Safe Place Project
COVID-19 had substantial effects on the IoT community that designs systems for safety: the need for face masks to be worn by everyone, the analysis of crowds to avoid the spread of the disease, and the sanitization of public environments have led to exceptional research acceleration and fast engineering of the related solutions. Now that the pandemic is losing strength, some applications are becoming less important, while others are proving to be useful regardless of the criticality of COVID-19. The Safe Place project is a prime example of this situation (DATE23 MPP category: final stage). Safe Place is an Italian 3M euro regional industrial/academic project, financed by European funds, created to ensure a multidisciplinary, choral reaction to COVID-19 in critical environments such as rest homes and public places. The Safe Place consortium was able to understand what is no longer useful in this post-pandemic period, and what instead is potentially attractive for the market. For example, the detection of face masks now has little importance, while sanitization retains much. This paper shares this analysis, which emerged through a co-design process of three public Safe Place project demonstrators, involving heterogeneous figures spanning from scientists to lawyers.